NeRFs: The Search for the Best 3D Representation
Neural Radiance Fields, or NeRFs, have become the representation of choice for
problems in view synthesis or image-based rendering, as well as in many other
applications across computer graphics and vision, and beyond. At their core,
NeRFs describe a new representation of 3D scenes or 3D geometry. Instead of
meshes, disparity maps, multiplane images or even voxel grids, they represent
the scene as a continuous volume, with volumetric parameters like
view-dependent radiance and volume density obtained by querying a neural
network. The NeRF representation has now been widely used, with thousands of
papers extending or building on it every year, multiple authors and websites
providing overviews and surveys, and numerous industrial applications and
startup companies. In this article, we briefly review the NeRF representation,
and describe the three-decade-long quest to find the best 3D representation
for view synthesis and related problems, culminating in the NeRF papers. We
then describe new developments in terms of NeRF representations and make some
observations and insights regarding the future of 3D representations.
Comment: Updated based on feedback in person and via e-mail at SIGGRAPH 2023.
In particular, I have added references and discussion of seminal SIGGRAPH
image-based rendering papers, and better put the recent Kerbl et al. work in
context, with more references.
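To make the representation concrete, here is a minimal sketch (not the authors' code) of the standard emission-absorption quadrature used to turn the density and radiance queried from the network into a pixel color; `field` is a hypothetical placeholder for the trained MLP.

```python
# Minimal sketch of NeRF-style volume rendering along a single ray.
# `field(points, view_dir)` is a placeholder for the neural network and is
# assumed to return per-sample density sigma (n,) and RGB color (n, 3).
import numpy as np

def render_ray(field, origin, direction, t_near=0.0, t_far=4.0, n_samples=64):
    """Composite a ray's color with the standard emission-absorption quadrature."""
    t = np.linspace(t_near, t_far, n_samples)              # sample depths along the ray
    points = origin + t[:, None] * direction                # 3D sample positions
    sigma, rgb = field(points, direction)                   # density and view-dependent color
    delta = np.diff(t, append=t[-1] + 1e10)                 # spacing between adjacent samples
    alpha = 1.0 - np.exp(-sigma * delta)                    # opacity of each segment
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))  # transmittance to each sample
    weights = trans * alpha
    return (weights[:, None] * rgb).sum(axis=0)             # expected color of the ray
```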
Light Field Blind Motion Deblurring
We study the problem of deblurring light fields of general 3D scenes captured
under 3D camera motion and present both theoretical and practical
contributions. By analyzing the motion-blurred light field in the primal and
Fourier domains, we develop intuition into the effects of camera motion on the
light field, show the advantages of capturing a 4D light field instead of a
conventional 2D image for motion deblurring, and derive simple methods of
motion deblurring in certain cases. We then present an algorithm to blindly
deblur light fields of general scenes without any estimation of scene geometry,
and demonstrate that we can recover both the sharp light field and the 3D
camera motion path of real and synthetically blurred light fields.
Comment: To be presented at CVPR 201
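One way to formalize the forward model the abstract analyzes (the notation below is illustrative, not the paper's): the recorded light field is the time average of the sharp light field re-parameterized by the camera pose over the exposure, and blind deblurring must recover both the sharp light field and the pose path from the blurred measurement alone.

```latex
% (x, u): spatial and angular ray coordinates of the 4D light field;
% p(t): camera pose over the exposure [0, T];
% Phi_{p(t)}: the ray re-parameterization induced by that pose.
\[
B(\mathbf{x}, \mathbf{u}) \;=\; \frac{1}{T}\int_{0}^{T}
  L\!\big(\Phi_{p(t)}(\mathbf{x}, \mathbf{u})\big)\, dt .
\]
% Blind deblurring estimates both L and p(t) given only the blurred measurement B.
```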
Image to Image Translation for Domain Adaptation
We propose a general framework for unsupervised domain adaptation, which
allows deep neural networks trained on a source domain to be tested on a
different target domain without requiring any training annotations in the
target domain. This is achieved by adding extra networks and losses that help
regularize the features extracted by the backbone encoder network. To this end,
we propose the novel use of the recently proposed unpaired image-to-image
translation framework to constrain the features extracted by the encoder
network. Specifically, we require that the features extracted are able to
reconstruct the images in both domains. In addition we require that the
distributions of features extracted from images in the two domains are
indistinguishable. Many recent works can be seen as specific cases of our
general framework. We apply our method for domain adaptation between MNIST,
USPS, and SVHN datasets, and Amazon, Webcam and DSLR Office datasets in
classification tasks, and also between GTA5 and Cityscapes datasets for a
segmentation task. We demonstrate state-of-the-art performance on each of these
datasets.
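A hedged sketch of the kind of combined objective described above (the module and function names are placeholders, not the paper's code): a supervised task loss on the labeled source domain, a reconstruction loss that forces the shared features to describe images of both domains, and a domain confusion term that pushes the two feature distributions to be indistinguishable.

```python
# Illustrative loss combination for feature-level domain adaptation.
# encoder, classifier, decoder, discriminator are hypothetical nn.Modules.
import torch
import torch.nn.functional as F

def adaptation_losses(encoder, classifier, decoder, discriminator,
                      x_src, y_src, x_tgt):
    f_src, f_tgt = encoder(x_src), encoder(x_tgt)

    # Supervised task loss, only available on the labeled source domain.
    task = F.cross_entropy(classifier(f_src), y_src)

    # Features must be able to reconstruct the images of both domains.
    recon = F.l1_loss(decoder(f_src), x_src) + F.l1_loss(decoder(f_tgt), x_tgt)

    # Domain confusion: the encoder is trained so target features look like
    # source features to the discriminator (which is trained separately).
    d_tgt = discriminator(f_tgt)
    confusion = F.binary_cross_entropy_with_logits(d_tgt, torch.ones_like(d_tgt))

    return task + recon + confusion
```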
Learning to Synthesize a 4D RGBD Light Field from a Single Image
We present a machine learning algorithm that takes as input a 2D RGB image
and synthesizes a 4D RGBD light field (color and depth of the scene in each ray
direction). For training, we introduce the largest public light field dataset,
consisting of over 3300 plenoptic camera light fields of scenes containing
flowers and plants. Our synthesis pipeline consists of a convolutional neural
network (CNN) that estimates scene geometry, a stage that renders a Lambertian
light field using that geometry, and a second CNN that predicts occluded rays
and non-Lambertian effects. Our algorithm builds on recent view synthesis
methods, but is unique in predicting RGBD for each light field ray and
improving unsupervised single image depth estimation by enforcing consistency
of ray depths that should intersect the same scene point. Please see our
supplementary video at https://youtu.be/yLCvWoQLnms
Comment: International Conference on Computer Vision (ICCV) 201
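For illustration, a minimal sketch (not the authors' code) of the middle, Lambertian rendering stage: each sub-aperture view is approximated by shifting pixels of the input image in proportion to a per-pixel disparity map assumed to come from the first CNN. Occlusions and non-Lambertian effects are deliberately ignored here, which is what the second CNN in the pipeline is there to correct.

```python
# Disparity-based warping of a single image into one sub-aperture view of a
# light field. Nearest-neighbor lookup keeps the sketch short; a real renderer
# would interpolate and handle occlusions.
import numpy as np

def render_subaperture(image, disparity, u, v):
    """image: (H, W, 3); disparity: (H, W) in pixels per unit angular offset (u, v)."""
    h, w, _ = image.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_y = np.clip(np.round(ys + v * disparity).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs + u * disparity).astype(int), 0, w - 1)
    return image[src_y, src_x]
```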
Creating Generative Models from Range Images
We describe a new approach for creating concise high-level generative models from one or more approximate range images. Using simple acquisition techniques and a user-defined class of models, our method produces a simple and intuitive object description that is relatively insensitive to noise and is easy to manipulate and edit. The algorithm has two inter-related phases -- recognition, which chooses an appropriate model within a given hierarchy, and parameter estimation, which adjusts the model to fit the data. We give a simple method for automatically making tradeoffs between simplicity and accuracy to determine the best model. We also describe general techniques to optimize a specific generative model. In particular, we address the problem of creating a suitable objective function that is sufficiently continuous for use with finite-difference-based optimization techniques. Our technique for model recovery and subsequent manipulation and editing is demonstrated on real objects -- a spoon, bowl, ladle, and cup -- using a simple tree of possible generative models. We believe that higher-level model representations are extremely important, and their recovery for actual objects is a fertile area of research towards which this thesis is a step. However, our work is preliminary and there are currently several limitations. The user is required to create a model hierarchy (and supply methods to provide an initial guess for model parameters within this hierarchy); the use of a large pre-defined class of models can help alleviate this problem. Further, we have demonstrated our technique on only a simple tree of generative models. While our approach is fairly general, a real system would require a tree that is significantly larger. Our methods work only where the entire object can be accurately represented as a single generative model; future work could use constructive solid geometry operations on simple generative models to represent more complicated shapes. We believe that many of the above limitations can be addressed in future work, allowing us to easily acquire and process three-dimensional shape in a simple, intuitive and efficient manner.
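As a toy illustration of the parameter-estimation phase (modern libraries, not the thesis code): fitting one candidate generative model, here a sphere with parameters (center, radius), to range samples by minimizing a smooth least-squares objective with finite-difference gradients. The recognition phase would compare such fitted residuals across the candidate models in the hierarchy.

```python
# Fit a sphere to range points with a smooth objective and a quasi-Newton
# method; gradients are obtained by finite differences, so the objective must
# be continuous, as the abstract emphasizes.
import numpy as np
from scipy.optimize import minimize

def fit_sphere(points):
    """points: (N, 3) array of range samples; returns optimized (cx, cy, cz, r)."""
    def objective(p):
        center, radius = p[:3], p[3]
        d = np.linalg.norm(points - center, axis=1) - radius
        return np.mean(d ** 2)                          # smooth least-squares residual

    x0 = np.concatenate([points.mean(axis=0), [points.std()]])  # crude initial guess
    res = minimize(objective, x0, method="BFGS")        # finite-difference gradients by default
    return res.x
```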
A First Order Analysis of Lighting, Shading, and Shadows
The shading in a scene depends on a combination of many factors: how the lighting varies spatially across a surface, how it varies along different directions, the geometric curvature and reflectance properties of objects, and the locations of soft shadows. In this paper, we conduct a complete first order or gradient analysis of lighting, shading and shadows, showing how each factor separately contributes to scene appearance, and when it is important. Gradients are well suited for analyzing the intricate combination of appearance effects, since each gradient term corresponds directly to variation in a specific factor. First, we show how the spatial and directional gradients of the light field change, as light interacts with curved objects. This extends the recent frequency analysis of Durand et al. to gradients, and has many advantages for operations, like bump-mapping, that are difficult to analyze in the Fourier domain. Second, we consider the individual terms responsible for shading gradients, such as lighting variation, convolution with the surface BRDF, and the object's curvature. This analysis indicates the relative importance of various terms, and shows precisely how they combine in shading. As one practical application, our theoretical framework can be used to adaptively sample images in high-gradient regions for efficient rendering. Third, we understand the effects of soft shadows, computing accurate visibility gradients. We generalize previous work to arbitrary curved occluders, and develop a local framework that is easy to integrate with conventional ray-tracing methods. Our visibility gradients can be directly used in practical gradient interpolation methods for efficient rendering.
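A schematic of the kind of first-order decomposition the abstract describes, written for a simplified Lambertian case with notation of our own choosing: by the product rule, the spatial shading gradient splits into a lighting-variation term and a term driven by the changing surface normal, i.e., the curvature.

```latex
% Lambertian shading and its spatial gradient (schematic, notation is ours).
\[
B(\mathbf{x}) \;=\; \int_{\Omega} L(\mathbf{x}, \omega)\,
  \max\!\big(\mathbf{n}(\mathbf{x})\cdot\omega,\, 0\big)\, d\omega ,
\]
\[
\nabla_{\mathbf{x}} B \;=\;
  \underbrace{\int_{\Omega} \nabla_{\mathbf{x}} L(\mathbf{x}, \omega)\,
      \max(\mathbf{n}\cdot\omega,\, 0)\, d\omega}_{\text{lighting variation}}
  \;+\;
  \underbrace{\int_{\Omega} L(\mathbf{x}, \omega)\,
      \nabla_{\mathbf{x}} \max\!\big(\mathbf{n}(\mathbf{x})\cdot\omega,\, 0\big)\, d\omega}_{\text{normal / curvature variation}} .
\]
```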
Dynamic Splines with Constraints for Animation
In this paper, we present a method for fast interpolation between animation keyframes that allows for automatic computer-generated "improvement" of the motion. Our technique is closely related to conventional animation techniques, and can be used easily in conjunction with them for fast improvements of "rough" animations or for interpolation to allow sparser keyframing. We apply our technique to construction of splines in quaternion space, where we show 100-fold speed-ups over previous methods. We also discuss our experiences with animation of an articulated human-like figure. Features of the method include: (1) development of new subdivision techniques based on the Euler-Lagrange differential equations for splines in quaternion space; (2) an intuitive and simple set of coefficients to optimize over, which is different from the conventional B-spline coefficients; (3) widespread use of unconstrained minimization as opposed to the constrained optimization needed by many previous methods. This speeds up the algorithm significantly, while still maintaining keyframe constraints accurately.
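A one-dimensional toy sketch of the unconstrained-minimization idea (not the paper's quaternion formulation): in-betweens are found by minimizing a discrete acceleration energy, and the keyframe constraints are met exactly by optimizing only over the non-keyframe samples, which makes the problem unconstrained.

```python
# Keyframe interpolation as unconstrained minimization of a spline-like energy.
import numpy as np
from scipy.optimize import minimize

def interpolate_keyframes(key_values, key_indices, n_samples):
    """key_values: keyframe values; key_indices: their sample indices in [0, n_samples)."""
    key_values = np.asarray(key_values, dtype=float)
    key_indices = np.asarray(key_indices)
    free = np.setdiff1d(np.arange(n_samples), key_indices)   # samples we may move

    def assemble(x_free):
        x = np.empty(n_samples)
        x[key_indices], x[free] = key_values, x_free
        return x

    def energy(x_free):
        # Discrete acceleration energy: sum of squared second differences.
        return np.sum(np.diff(assemble(x_free), n=2) ** 2)

    x0 = np.interp(np.arange(n_samples), key_indices, key_values)[free]  # linear start
    res = minimize(energy, x0, method="L-BFGS-B")                        # unconstrained
    return assemble(res.x)
```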
Efficient Shadows from Sampled Environment Maps
This paper addresses the problem of efficiently calculating shadows from environment maps. Since accurate rendering of shadows from environment maps requires hundreds of lights, the expensive computation is determining visibility from each pixel to each light direction, such as by ray-tracing. We show that coherence in both spatial and angular domains can be used to reduce the number of shadow rays that need to be traced. Specifically, we use a coarse-to-fine evaluation of the image, predicting visibility by reusing visibility calculations from four nearby pixels that have already been evaluated. This simple method allows us to explicitly mark regions of uncertainty in the prediction. By only tracing rays in these and neighboring directions, we are able to reduce the number of shadow rays traced by up to a factor of 20 while maintaining error rates below 0.01%. For many scenes, our algorithm can add shadowing from hundreds of lights at twice the cost of rendering without shadows.
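A hedged sketch of the visibility-prediction step (the data structures and names are placeholders, not the paper's implementation): a newly shaded pixel reuses the light directions on which its four already-evaluated neighbors agree and traces shadow rays only where they disagree. The paper additionally expands this uncertain set to neighboring directions, which is omitted here for brevity.

```python
# Predict per-light visibility for one pixel from four previously shaded
# neighbors, tracing shadow rays only for the uncertain directions.
import numpy as np

def predict_visibility(neighbor_masks, trace_ray):
    """neighbor_masks: (4, n_lights) bool; trace_ray(i) -> bool traces one shadow ray."""
    agree_visible = neighbor_masks.all(axis=0)        # all four neighbors see the light
    agree_blocked = ~neighbor_masks.any(axis=0)       # all four neighbors are shadowed
    uncertain = ~(agree_visible | agree_blocked)      # disagreement must be resolved by rays

    visibility = agree_visible.copy()
    for i in np.flatnonzero(uncertain):               # trace only uncertain directions
        visibility[i] = trace_ray(i)
    return visibility, int(uncertain.sum())
```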